Supplementary Material: Finding Linear Structure in Large Datasets with Scalable Canonical Correlation Analysis
Abstract
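$$\Delta\tilde\phi^t = \tilde\phi^t - \tilde\phi_1, \quad \Delta\tilde\psi^t = \tilde\psi^t - \tilde\psi_1, \quad \Delta\phi^t = \phi^t - \phi_1, \quad \Delta\psi^t = \psi^t - \psi_1.$$

Further, we define $\cos_x(u, v) = \dfrac{u^\top S_x v}{\|u\|_x \|v\|_x}$, the cosine of the angle between two vectors induced by the inner product $\langle u, v \rangle = u^\top S_x v$. Similarly, we define $\cos_y(u, v) = \dfrac{u^\top S_y v}{\|u\|_y \|v\|_y}$. To prove the theorem, we will repeatedly use the following two lemmas.

Proof of Lemma 1. The proof is in the main paper.

Lemma 2. $$\|\Delta\phi^t\|_x \le \frac{1}{\lambda_1} \sqrt{\frac{2}{1 + \cos_x(\phi^t, \phi_1)}}\, \|\Delta\tilde\phi^t\|_x \quad \text{and} \quad \|\Delta\psi^t\|_y \le \frac{1}{\lambda_1} \sqrt{\frac{2}{1 + \cos_y(\psi^t, \psi_1)}}\, \|\Delta\tilde\psi^t\|_y.$$

Proof of Lemma 2. Notice that $\cos_x(\tilde\phi^t, \tilde\phi_1) = \cos_x(\phi^t, \phi_1)$. The distance from $\tilde\phi_1$ to any point on the line spanned by $\tilde\phi^t$ is at least $\|\tilde\phi_1\|_x$ times the sine of the angle between them, so

$$\|\Delta\tilde\phi^t\|_x^2 = \|\tilde\phi^t - \tilde\phi_1\|_x^2 \ge \|\tilde\phi_1\|_x^2 \sin_x^2(\phi^t, \phi_1) = \lambda_1^2 \sin_x^2(\phi^t, \phi_1).$$

Also notice that $\|\phi^t\|_x = \|\phi_1\|_x = 1$, which implies

$$\cos_x(\phi^t, \phi_1) = 1 - \|\phi^t - \phi_1\|_x^2 / 2 = 1 - \|\Delta\phi^t\|_x^2 / 2.$$

Further,

$$\|\Delta\tilde\phi^t\|_x^2 \ge \lambda_1^2 \sin_x^2(\phi^t, \phi_1) = \lambda_1^2 \left(1 - \cos_x^2(\phi^t, \phi_1)\right) = \frac{\lambda_1^2}{2} \|\Delta\phi^t\|_x^2 \left(1 + \cos_x(\phi^t, \phi_1)\right).$$

Taking square roots on both sides and rearranging,

$$\|\Delta\phi^t\|_x \le \frac{1}{\lambda_1} \sqrt{\frac{2}{1 + \cos_x(\phi^t, \phi_1)}}\, \|\Delta\tilde\phi^t\|_x.$$

A similar argument shows that

$$\|\Delta\psi^t\|_y \le \frac{1}{\lambda_1} \sqrt{\frac{2}{1 + \cos_y(\psi^t, \psi_1)}}\, \|\Delta\tilde\psi^t\|_y.$$

1.1. Proof of Theorem 2.1

Without loss of generality, we can always assume $\cos_x(\phi^t, \phi_1), \cos_y(\psi^t, \psi_1) \ge 0$, because the canonical vectors are only identifiable up to a flip in sign and we can always choose $\phi_1, \psi_1$ such that the cosines are nonnegative. Apply simple algebra to the gradient …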
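The inequality in Lemma 2 can be checked numerically. The following is a minimal sketch, not code from the paper. It assumes that the unnormalized target is $\tilde\phi_1 = \lambda_1 \phi_1$ (consistent with $\|\tilde\phi_1\|_x = \lambda_1$ in the proof) and that $\phi^t$ is $\tilde\phi^t$ rescaled to unit $x$-norm. The names `x_norm`, `cos_x`, and `lemma2_sides` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def x_norm(u, S):
    """Norm induced by the inner product <u, v> = u^T S v (S symmetric positive definite)."""
    return float(np.sqrt(u @ S @ u))


def cos_x(u, v, S):
    """Cosine of the angle between u and v under the S-inner product."""
    return float(u @ S @ v) / (x_norm(u, S) * x_norm(v, S))


def lemma2_sides(phi_tilde_t, phi_1, lam1, S):
    """Return (lhs, rhs) of the first inequality in Lemma 2.

    Assumptions (not stated in the excerpt itself):
      * phi_1 has unit x-norm,
      * the unnormalized target is phi_tilde_1 = lam1 * phi_1,
      * phi_t is phi_tilde_t rescaled to unit x-norm.
    """
    phi_tilde_1 = lam1 * phi_1
    phi_t = phi_tilde_t / x_norm(phi_tilde_t, S)
    c = cos_x(phi_t, phi_1, S)
    lhs = x_norm(phi_t - phi_1, S)
    rhs = np.sqrt(2.0 / (1.0 + c)) * x_norm(phi_tilde_t - phi_tilde_1, S) / lam1
    return lhs, rhs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 4))
    S = A.T @ A / 50                      # sample covariance standing in for S_x
    phi_1 = rng.standard_normal(4)
    phi_1 /= x_norm(phi_1, S)             # enforce unit x-norm
    lhs, rhs = lemma2_sides(rng.standard_normal(4), phi_1, 0.8, S)
    print(f"lhs = {lhs:.4f}, rhs = {rhs:.4f}, holds = {lhs <= rhs}")
```

The check relies only on the geometric step of the proof, so it should hold for any unnormalized iterate whose cosine with $\phi_1$ is strictly greater than $-1$; the nonnegativity assumption used later in the proof of Theorem 2.1 is not needed here.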
Similar resources
Finding Linear Structure in Large Datasets with Scalable Canonical Correlation Analysis
Canonical Correlation Analysis (CCA) is a widely used spectral technique for finding correlation structures in multi-view datasets. In this paper, we tackle the problem of large scale CCA, where classical algorithms, usually requiring computing the product of two huge matrices and huge matrix decomposition, are computationally and storage expensive. We recast CCA from a novel perspective and pr...
Learning Mixtures of Multi-Output Regression Models by Correlation Clustering for Multi-View Data
In many datasets, different parts of the data may have their own patterns of correlation, a structure that can be modeled as a mixture of local linear correlation models. The task of finding these mixtures is known as correlation clustering. In this work, we propose a linear correlation clustering method for datasets whose features are pre-divided into two views. The method, called Canonical Le...
Multi-View Canonical Correlation Analysis
Canonical correlation analysis (CCA) is a method for finding linear relations between two multidimensional random variables. This paper presents a generalization of the method to more than two variables. The approach is highly scalable, since it scales linearly with respect to the number of training examples and number of views (standard CCA implementations yield cubic complexity). The method i...
Correlation Pattern between Temperatur, Humidity and Precipitaion by using Functional Canonical Correlation
Understanding dependence structure and relationship between two sets of variables is of main interest in statistics. When encountering two large sets of variables, a researcher can express the relationship between the two sets by extracting only finite linear combinations of the original variables that produce the largest correlations with the second set of variables. When data are con...
Identification of Risk Factors by Using Macroeconomic and Firm-Specific Variables Simultaneously in Tehran Stock Exchange by Applying Canonical Correlation Analysis
The main objective of this study is to give the insight of describing mixing accounting ratios and macroeconomic variables as the risk factors in Iran. The results indicate a significant relationship between book to market ratio, financial leverage, size factors and expected stock returns in the Iranian market. In consistent with the other studies, we came to the conclusion that the term struct...